Fix: changing request_queue_id every time! #13
The reason is that Crawlee caches all requests, so every time it is re-initialized it re-fetches the previously enqueued links instead of the new ones. The Crawlee community still does not provide a good way to remove the request queue.
Walkthrough
This pull request updates the initialization process of the crawler.
Sequence Diagram(s)
sequenceDiagram
participant Client as CrawleeClient
participant UUID as uuid.uuid4
Client->>Client: Initialize with configuration
Note right of Client: Set purge_on_start = True
Client->>UUID: Generate unique queue ID
UUID-->>Client: Return new UUID (hex)
Client->>Client: Assign default_request_queue_id
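The flow in the diagram can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `CrawlerConfig` and `fresh_config` are hypothetical stand-ins for the real configuration object and client initializer, and only the field names `purge_on_start` and `default_request_queue_id` are taken from the diagram.

```python
import uuid
from dataclasses import dataclass


@dataclass
class CrawlerConfig:
    """Hypothetical stand-in for the crawler's configuration object."""

    purge_on_start: bool = False
    default_request_queue_id: str = "default"


def fresh_config() -> CrawlerConfig:
    """Build a configuration that does not reuse a previous request queue."""
    config = CrawlerConfig()
    # Ask the storage layer to purge leftover state when the crawler starts.
    config.purge_on_start = True
    # A new random queue ID per initialization, so requests cached under
    # an earlier ID are not picked up again.
    config.default_request_queue_id = uuid.uuid4().hex
    return config


first = fresh_config()
second = fresh_config()
# Each initialization gets its own queue ID.
print(first.default_request_queue_id != second.default_request_queue_id)
```

Because the ID is regenerated on every initialization, queues from earlier runs are never reused. They are also never cleaned up by this sketch, so old queue storage may accumulate unless it is purged separately.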